Lexical Coverage Issues for Speech Recognition in Indian Languages

Author

  • Gopala Krishna Anumanchipalli

Abstract

This report investigates issues of lexical coverage in Indian languages. More specifically, it presents a parallel analysis of out-of-vocabulary (OOV) words in Telugu and Tamil. Although the approach is generic, the study focuses on understanding the morphological aspects of these languages that matter for speech recognition. The observations reveal that morphological analysis and preprocessing can increase lexical coverage by over 50%, bringing the figures for these languages closer to those for English.
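
As a rough illustration of the quantity the abstract refers to, the sketch below computes an out-of-vocabulary (OOV) rate, i.e. the share of test tokens missing from a vocabulary built on training text, before and after a toy suffix-stripping step. It is not the report's method: the function names `oov_rate` and `strip_suffixes`, the single-suffix rule, and the English sample words are all assumptions made for the example, and real morphological analysis of Telugu or Tamil is far more involved.

```python
from collections import Counter


def oov_rate(train_tokens, test_tokens, vocab_size=None):
    """Fraction of test tokens absent from a vocabulary built on train_tokens.

    vocab_size keeps only the most frequent training words (None keeps all).
    """
    counts = Counter(train_tokens)
    vocab = {word for word, _ in counts.most_common(vocab_size)}
    if not test_tokens:
        return 0.0
    misses = sum(1 for word in test_tokens if word not in vocab)
    return misses / len(test_tokens)


def strip_suffixes(token, suffixes):
    """Hypothetical stand-in for morphological analysis: drop the first matching suffix."""
    for suffix in suffixes:
        if token.endswith(suffix) and len(token) > len(suffix):
            return token[: -len(suffix)]
    return token


# Made-up sample data, used only to exercise the functions.
train = ["house", "tree", "river"]
test = ["houses", "trees", "river", "rivers"]
suffixes = ["s"]

raw = oov_rate(train, test)  # 3 of the 4 test tokens are unseen: 0.75
stemmed = oov_rate(
    [strip_suffixes(t, suffixes) for t in train],
    [strip_suffixes(t, suffixes) for t in test],
)  # every stem was seen in training: 0.0
print(raw, stemmed)
```

Lexical coverage is then simply one minus the OOV rate, so lowering the OOV rate by splitting inflected forms into stems and affixes raises coverage for a vocabulary of the same size.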

Similar resources

Speech Recognition of European Languages

A basic overview is presented of the main ongoing efforts in large vocabulary, continuous speech recognition (LVCSR) for European languages. We address issues in acoustic modeling, lexical representation, and language modeling for several European languages, as well as issues in comparative evaluation.

The Relationship between Syntactic and Lexical Complexity in Speech Monologues of EFL Learners

This study aims to explore the relationship between syntactic and lexical complexity and also the relationship between different aspects of lexical complexity. To this end, speech monologues of 35 Iranian high-intermediate learners of English on three different tasks (i.e. argumentation, description, and narration) were analyzed for correlations between one measure of sy...

Voice Input/Output Systems for Indian Languages

In this paper, the problems and prospects of voice input/output to a computer are discussed. Current attempts to provide speech input/output facilities to a computer are described. The scope of the speech recognition problem is defined. Issues involved in the design of text-to-speech and speech-to-text systems are discussed. Since any sophisticated voice input/output system uses several l...

Multilingual Speech Recognition for Information Retrieval in Indian Context

This paper analyzes various issues in building an HMM-based multilingual speech recognizer for Indian languages. The system was originally designed for Hindi and Tamil and adapted to incorporate Indian-accented English. Language-specific characteristics in the speech recognition framework are highlighted. The recognizer is embedded in information retrieval applications and hence several iss...

A Lexical Knowledge Driven Manner Based Speech Recognition Model

The emergence of speech as a more direct means of interaction with computers promises the possibility of leapfrogging into the development of tomorrow’s interfaces in Indian languages for IT to Mass. Speech recognition is one of the most challenging speech technologies for the development of speech-mode man-to-machine communication. Speech recognition is the process of converting an acoust...

Publication date: 2007